Lowering C11 Atomics for ARM in LLVM
Abstract
This report explores how LLVM generates the memory barriers needed to support C11/C++11 atomics on ARM. I measure the influence of memory barriers on performance and show that in some cases LLVM generates more barriers than necessary. Leaving these extra barriers out increases performance significantly. I introduce two LLVM passes that remove them, improving performance in my test by 40%. I believe one of these passes is ready to be upstreamed to LLVM, while the other needs more testing.
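As a rough illustration of the kind of source code involved, the sketch below uses C11 release/acquire atomics to hand a value from a writer to a reader. It is not taken from the report: the names `payload`, `ready`, `publish`, and `try_consume` are hypothetical. On ARMv7, compilers typically implement these orderings with `dmb` barrier instructions around the plain load and store, though the exact sequence depends on the target and the LLVM version.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example, not from the report. */
static int payload;        /* plain, non-atomic data */
static atomic_bool ready;  /* flag guarding payload; zero-initialized to false */

/* Writer: the release store orders the payload write before the flag write. */
void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: the acquire load orders the payload read after the flag read.
 * Returns false if nothing has been published yet. */
bool try_consume(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

Each atomic operation here needs its own ordering guarantee, so a straightforward lowering emits a barrier per operation. Code that performs several such operations back to back is where redundant barriers could plausibly appear.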
Related resources
Overhauling SC atomics in C11 and OpenCL
Despite the conceptual simplicity of sequential consistency (SC), the semantics of SC atomic operations and fences in the C11 and OpenCL memory models is subtle, leading to convoluted prose descriptions that translate to complex axiomatic formalisations. We conduct an overhaul of SC atomics in C11, reducing the associated axioms in both number and complexity. A consequence of our simplification...
Improving Switch Lowering for The LLVM Compiler System
Switch-case statements (or switches) provide a natural way to express multiway branching control flow semantics. They are common in many applications, including compilers, parsers, text processing programs, and virtual machines. Various optimizations for switches have been studied for many years. This paper presents the description of switch lowering refactoring recently made for the LLVM Compiler Sy...
LLVM in the FreeBSD Toolchain
The most obvious incentive for the FreeBSD project to switch from GCC to Clang was the decision by the Free Software Foundation to switch the license of GCC to version 3 of the GPL. This license is unacceptable to a number of large FreeBSD consumers. Given this constraint, the project had a choice of either maintaining a fork of GCC 4.2.1 (the last GPLv2 release), staying with GCC 4.2.1 forever...
Automatic Generation of Assembly to IR Translators Using Compilers
Translating low-level machine instructions into higher-level intermediate representation (IR) is one of the central steps in many binary translation, analysis and instrumentation systems. Most of these systems manually build the machine instruction to IR mapping table needed for such a translation. As a result, these systems often suffer from two problems: (a) a great deal of manual effort is r...
Code generation for a Coarse-Grained Reconfigurable Architecture
Good tool support is essential for computing platforms because it increases programmability. This is especially the case for reconfigurable architectures because applications need to be mapped on the architecture for each configuration individually. This paper introduces a compiler backend for Coarse Grained Reconfigurable Arrays (CGRA) based on LLVM. The CGRA compiler must be retargetable to ...
Publication date: 2014